Template for test
In [ ]:
from pred import Predictor
from pred import sequence_vector
from pred import chemical_vector
Controlling for Random Negative vs Sans Random in Imbalanced Techniques using S, T, and Y Phosphorylation.
N phosphorylation is included; however, no benchmarks are available for it yet.
Training data is from Phospho.ELM and benchmarks are from dbPTM.
Note: SMOTEENN seems to perform best.
In [ ]:
par = ["pass", "ADASYN", "SMOTEENN", "random_under_sample", "ncl", "near_miss"]
for i in par:
    # Run without random negatives (random_data=0).
    print("y", i)
    y = Predictor()
    y.load_data(file="Data/Training/k_acetylation.csv")
    y.process_data(vector_function="sequence", amino_acid="K", imbalance_function=i, random_data=0)
    y.supervised_training("bagging")
    y.benchmark("Data/Benchmarks/acet.csv", "K")
    del y

    # Same pipeline with random negatives enabled (random_data=1).
    print("x", i)
    x = Predictor()
    x.load_data(file="Data/Training/k_acetylation.csv")
    x.process_data(vector_function="sequence", amino_acid="K", imbalance_function=i, random_data=1)
    x.supervised_training("bagging")
    x.benchmark("Data/Benchmarks/acet.csv", "K")
    del x
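For readers unfamiliar with the imbalance options compared above, here is a minimal pure-Python sketch of the idea behind random under-sampling: majority-class examples are dropped at random until every class is the same size as the smallest one. This is an illustration only and is not the code that `pred` runs for `"random_under_sample"`; the function body, the `seed` parameter, and the toy sequence windows are all assumptions made for the example.

```python
import random
from collections import Counter


def random_under_sample(samples, labels, seed=0):
    """Illustrative only: shrink every class to the size of the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is reduced to the size of the rarest class.
    target = min(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        kept = rng.sample(group, target)  # sampling without replacement
        out_samples.extend(kept)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Hypothetical toy data: 2 positive and 6 negative sequence windows.
samples = ["AAKAA", "GGKGG", "CCKCC", "DDKDD", "EEKEE", "FFKFF", "HHKHH", "IIKII"]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

balanced_samples, balanced_labels = random_under_sample(samples, labels)
print(Counter(balanced_labels))  # both classes now have the same count
```

Under-sampling discards data, which is one reason it can behave differently from over-sampling options such as ADASYN or combined options such as SMOTEENN in the comparison above.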